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ABSTRACT 

Based on the work by M itra, Choudhury & Ferrara (20 IP} , we obtain model-independent con- 
straints on reionization from cosmic microwave background (CMB) and QSO absorption 
line data by decomposing the function Ni on (z) (the number of photons entering the IGM 
per baryon in collapsed objects) into its principal components. The main addition in this 
work is that for the CMB data set, we explicitly include the angular power spectra C/ for 
TT, TE and EE modes in our analysis which seem to contain somewhat more information 
than taking the electron scattering optical depth r e i as a single data point. Using Markov 
Chain Monte Carlo methods, we find that all the quantities related to reionization can be 
severely constrained at z < 6 whereas a broad range of reionization histories at z > 6 are 
still permitted by the current data sets. With currently available data from WMAP7, we con- 
strain 0.080 < T e i < 0.112 (95% CL) and also conclude that reionization is 50% com- 
plete between 9.0 < z(Qhii = 0.5) < 11.8 (95% CL) and is 99% complete between 
5.8 < z(Q m i = 0.99) < 10.4 (95% CL). With the forthcoming PLANCK data on large- 
scale polarization (ignoring effect of foregrounds), the z > 6 constraints will be improved 
considerably, e.g., the 2 — a error on t c \ will be reduced to 0.009 and the uncertainties on 
z{Qm\ = 0.5) and z(Qhii = 0.99) would be ~ 1 and 3 (95% CL), respectively. For more 
stringent constraints on reionization at z > 6, one has to rely on data sets other than CMB. 
Our method will be useful in such case since it can be used for non-parametric reconstruction 
of reionization history with arbitrary data sets. 
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1 INTRODUCTION 

In the past few years, the understanding of reionization process has 
become increasingly sophisticated in both the observational and 
theoretical communities (for reviews, see, Loeb&Barkana 2001; 

|Choudhury & Ferrara 2006a) 

Choudhury 2009 



Furlanetto, Oh & Briggs 2006 , 
Fan, Carilli, & Keating 2006 and the references therein), thanks to 



the availability of good quality data related to reionization. Mainly, 
the observations by the Wilkinson Microwave Anisotropy Probe 
(WMAP) satellite of cosmic microwave background (CMB) and 
highest redshift QSOs put very tight constraints on the reionization 
history of the universe. The WMAP seven-year observation man- 
ifests the Thomson scattering optical depth r e j = 0.088 ± 0.015 
( ILarson et al. 20101 with the simple assumption that the universe 
was reionized instantaneously. However, recent studies suggest 
that reionization process is too complex to be described as a sudden 
process. In fact, the physical processes relevant to reionization 
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are so complex that neither the analytical nor the numerical 
simulations alone can capture the overall picture. That's why, it is 
often studied using semi-analytical models of reionization, with 
limited computational resources. 

The major uncertainty in modeling any semi-analytical reion- 
ization scenario is to model the parameter Mon, the num- 
ber of photons entering the IGM per baryon in collapsed ob- 
jects, which can be a function of redshift z. In analytical stud- 
ies, Ni on (z) is either taken to be a piecewise constant func- 
tion ([Wyithe & Loeb 2003] |Choudhury & Ferrara 2005), parame- 
terized using some known functions jChiu, Fan, & Ostriker 2003 [ 
|Pritchard, Loeb, & Wyithe 2010[>, modeled using a physically- 
motivated prescription ( [Choudhury & Ferr ara 2006b), or taken 
to be an arbitrary function of z and decomposed into its 
principal components using the principal component analysis 
(Mitra, Choudhury & Ferrara 2010. hereafter Paper I). 

The principal component method has been applied to study 
the constraints on reionization from large-scale CMB polariza- 
tion dMortonson & Hu 20081 IHu & Holder 2003 1 . It is well es- 
tablished that, the inhomogeneity signature of reionization is 
expected to contribute to the CMB temperature and polariza- 
tion anisotropies JHu 2000llSalvaterra et al. 2005lllEev et al. 20061 
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IMorto nson & Hu 2007 ). In fact, the CMB power spectra contain 
more information than the optical depth integrated over the whole 
ionization history (Hu & Holder 2003 1. So it is worth asking what 
we can ultimately expect to learn about the reionization model with 
principal component technique from the current CMB data sets in- 
stead of single optical depth data. 

In our previous work, we made a preliminary attempt 
to constrain Ni on (z) using PC A and estimated the uncertain- 
ties in the reionization history. The main difference of our 
work with other PCA of reionization history using CMB data 
JMortonson & Hu 20081 fHu & Holder 20031 is that we use a self- 
consistent model of reionization and include data sets other than 
CMB (e.g., QSO absorption lines) in the analysis. Such an analysis 
should give us a handle in not only constraining the evolution of 
the electron fraction x e (z) (as is done in usual reionization related 
studies using CMB data) but also in constraining the evolution of 
source properties like galactic IMF, star-formation history, and es- 
cape fraction of ionizing photons. 

In Paper I, we found that to model Ni on (z) over the range 
2 < z < 14 one should include the first 5 principal compo- 
nents with smaller uncertainties. We concluded that a wide range of 
reionization scenarios are allowed by the data sets of photoioniza- 
tion rates, redshift distribution of Lyman-limit systems and the elec- 
tron scattering optical depth from WMAP7. In this paper, we ex- 
tend our previous work to study the effect of inclusion of the angu- 
lar power spectra Ci of the CMB temperature (T) and polarization 
(E) modes. Using the available WMAP7 data on £' ; TT > TE ' EE j we 
study the present constraints on reionization history. We also fore- 
cast the errors on reionization history as would be determined by 
future observations of large scale polarization signal by PLANCKq 

The paper is organized as follows. In Section [2] we discuss 
about the features of the semi-analytical model of reionization and 
its modifications for including CMB data. We also outline the ba- 
sic theory of the principal component analysis in this section. We 
describe our results of the principal component approach to reion- 
ization model with large-scale ii-mode data in Section [3] In this 
Section, using the Markov Chain Monte Carlo methods, we exam- 
ine our model for both 7-year WMAP data and simulated PLANCK 
data. Finally we summarize our main findings and conclude in Sec- 
tion!] 



2 MODEL AND METHOD OF STATISTICAL ANALYSIS 

2.1 Semi-analytical model of reionization with PCA 

We first describe the method used in our previous work 
(Paper I) which was based on the semi-analytical model 
of reionization developed in |Choudhury & Ferrara (2006b) and 
|Choudhury & Ferrara (2005) . The main features of the model are: 

• The model follows the ionization and thermal histories 
of neutral, HII and Helll regions simultaneously and self- 
consistently taking the IGM inhomogeneities by adopting a 
lognormal distribution according to the method outlined in 
|Miralda-Escude, Haehnelt, & Rees (2000| >. 

• Given the collapsed fraction / co n of dark matter haloes, this 
model calculates the production rate of ionizing photons in the IGM 
as 

n ph (z) = d ^ c °" (1) 

1 http://www.esa.int/SPECIALS/Planck/index.html 



where rib is the total baryonic number density in the IGM, Mon is 
the number of photons entering the IGM per baryon in collapsed 
objects. The parameter iVion can actually be written as a com- 
bination of various other parameters which characterize the star- 
forming efficiency (fraction of baryons within collapsed haloes go- 
ing into stars), the fraction of photons escaping into the IGM, and 
the number of photons emitted per frequency range per unit mass 
of stars (which depends on the stellar IMF and the corresponding 
stellar spectrum). 

• The model computes radiative feedback (suppressing star for- 
mation in low-mass haloes using a Jeans mass prescription) self- 
consistently from the evolution of the thermal properties of the 
IGM. The corresponding filtering scale, which depends on the tem- 
perature evolution of the IGM, is found to be typically around 
~ 30 km s _1 . We should mention here that minimum mass of 
star-forming haloes is much larger in ionized regions than in the 
neutral regions because of this radiative feedback. For that, this 
model takes the filter mass for ionized region and atomic cooling 
(i.e. small halo) for the neutral region. 

• In Paper I, we assume 7Vi on to be an unknown function of z 
and decompose it into principal components. These principal com- 
ponents essentially filter out components of the model which are 
most sensitive to the data and thus they are the ones which can be 
constrained most accurately. We carry out our analysis assuming 
that only one population of stars contribute to the ionizing radia- 
tion; any change in the characteristics of these stars over time or 
the chemical feedback prescription would be accounted for indi- 
rectly by the evolution of JVi on . We also include the contribution of 
quasars at z < 6 assuming that they have negligible effects on IGM 
at higher redshifts, but are significant sources of photons at z < 4. 

• Usually, the model is constrained by comparing with a vari- 
ety of observational data, but to keep the analysis simple, we used 
the three main data sets in our earlier work, namely, the photoion- 
ization rates Tpi obtained using Lya forest Gunn-Peterson opti- 
cal depth observations and a large set of hydrodynamical simula- 
tions (Bolton & Haehnelt 2007), the redshift distribution of LLS 
dN LL /dz at z ~ 3.5 ( |Prochaska, O'Meara, & Worseck 2010| > 
and the WMAP7 data on electron scattering optical depth r e i 
( ILarson et al. 20101 . It should be mentioned that in this work, we 
have used the data on Ci's rather than the constraints on Tel, which 
will be described in the next subsection. 

• The free parameters used in the model are the coefficients re- 
lated to the principal components of Ni on and Ao (the normalization 
which determines the mean free path of photons). The constraints 
on A'ion were obtained by marginalizing over Ao. The cosmologi- 
cal parameters were taken to be fixed (given by the best-fit WMAP7 
values) and not varied at all. 

2.2 Data sets and free parameters 

The major modifications made in this work compared to our pre- 
vious one are related to how we treat the CMB data sets. Note 
that in Paper I, r e i constraint was treated as a single data point 
which can be thought as a simplification of the CMB polarization 
observations at low multipole moments (Burigana et al. 2008). We 
know that, the amplitude of fluctuations in the large-scale (low-?) 
£-mode component of CMB polarization provides the current best 
constraint on r e i. Using the data from seven year WMAP and the 
assumption of instantaneous reionization, Larson et al. (20 1JJ) find 
t i = 0.088 ± 0.015. However, recent theoretical and numerical 
studies suggest that reionization is a fairly complex process. In that 
case, the low-i £-mode spectrum depends not just on r c i but also on 
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the detailed redshift evolution of the number density of free elec- 
trons in the IGM, x e (z). For fixed values of r e i and all other rel- 
evant cosmological parameters, differences in x s (z) can affect the 
shape of the large-scale E-mode angular power spectrum up to mul- 
tipoles Z ~ 40 — 50. Because of this dependence, measurements of 
the low-Z C EE should place at least weak constraints on the overall 
reionization history in addition to the constraint on the total optical 
depth. 

Now, in our model, the change in the parameter Ni on (z) di- 
rectly corresponds to the change in x e (z) i.e. in other words, 
changes in Mon can affect the shape of low-Z C EE . So, incorpo- 
rating the data sets for large-scale EE polarization signal in our 
model can provide important information about the evolution of 
iVion at z > 6 beyond the information about r e i. Our hope is this 
may be most useful for distinguishing the models of reionization 
with different ionization histories but same optical depth. Keeping 
this in mind, it would be more prudent to work with the actual data 
related to the angular power spectra Ci and obtain constraints on 
reionization parameters; the constraint on r c i will be determined a 
posteriori. 

The moment we include the Ci's (TT+TE+EE) in our anal- 
ysis, we realize that parameters related to reionization may have 
strong degeneracies with (some of) the cosmological parameters 
and hence constraints on reionization without varying cosmologi- 
cal parameters would be misleading. On the other hand, including 
all the cosmological parameters in the analysis would increase the 
number of free parameters to a large number. Usually, it is found 
that r e i is strongly degenerate with the normalization of the matter 
power spectrum as and also with the slope n s ( Spergel et al. 2003 [ >. 
Hence, it may be worthwhile to verify whether we can carry out our 
analysis by varying only these two parameters (in addition to the 
parameters related to reionization model) and keeping all the other 
cosmological parameters fixed to their mean value. 

To verify the viability of this method, we re-do the anal- 
ysis of WMAP7 data with instantaneous reionization history 
(as in ILarson et al. 20101 We assume the universe to be de- 
scribed by a flat cold dark matter model with a cosmologi- 
cal constant (ACDM) which is parametrized by six parameters 
(Qbh 2 , JIdm/i 2 , Ho, n s , as, r e i). We then carry out the standard 
MCMC analysis ( Verde et a l~2003l > first varying all six parame- 
ters and then keeping all but as, n s and r e i fixed to their best-fit 
values. The results are shown in Table [T] It is clear that though 
the uncertainties on n s and as are reduced considerably because 
of not varying the other three parameters, the constraints on r e i 
are relatively unchanged. There is only a slight (< 15 percent) 
decrease in the error-bars, thus indicating that the parameters re- 
lated to reionization are only moderately degenerate with the other 
cosmological parameters. Hence, we can carry our analysis with 
the other cosmological parameters fixed keeping in mind that the 
uncertainties in reionization history would possibly be slightly 
underestimated. This approach is similar to what is adopted by 
|Mortonson & Hu (2007) . 

In addition to the CMB data, we have also in- 
cluded the more recent measurements of dN^/dz by 
Songaila & Cow ie (2010| > instead of the previous data by 
Prochaska, O'Meara, & Worseck (2010). The new data set in- 
cludes observations over a wide redshift range (0.36 < z < 6) and 
is well suited for studying the evolution of reionization. 

The likelihood function used in our calculations is given by 



Parameters Mean value and 1 — a errors 

varying all 6 parameters varying only 3 parameters 



Q b h 2 x 10 2 


2 24O+0-056 


2.249 (fixed) 


!!dm'i 2 


1120+ ' 0056 

u.iizu- .o 056 


0.1120 (fixed) 


Ho 


70A+H 


70.4 (fixed) 


n„ 


Q67 +o bl4 
u - 90 ' -0.014 


n QfiO+0 007 


t?8 


n S1 , +0.030 


0.816±8;gl| 


Tel 


n nss+ - 007 


0.088±° ;° O ? 



Table 1. Mean value of parameters and the corresponding errors for a flat 
ACDM cosmological model with instantaneous reionization. The results 
are shown when all six parameters are varied and when all but n s ,as and 
T e i are kept fixed to their mean values. 



where C is the negative of the log-likelihood and estimated using 
the relation 



7" 



(3) 



where J a represents the set of iV bs observational data points re- 
lated to photoionization rate and distribution of Lyman-limit sys- 
tems, i.e., J a — {log(rpi), dA^LL/dz}, a a are the correspond- 
ing observational error-bars and £.' is negative of WMAP7 or 
PLANCK log-likelihood function for Cf T , C, TE and Cf E up to 
/ = 2000. We constrain the free parameters by maximizing the 
likelihood function with a prior that reionization should be com- 
pleted by z = 5.8, otherwise it will not match Lya and Ly/3 forest 
transmitted flux data. 

In this work, we calculate likelihoods using the code de- 
scribed in Paper I which is essentially based on the pub- 
licly available COSMOMC0 dLewis & Bridle 2002t code. Be- 
sides this, throughout we work in a flat cold dark mat- 
ter model with a cosmological constant (ACDM) cosmol- 
ogy with the cosmological parameters given by the current 
WMAP7 (based on RECFAST 1. 5 l|Seager, Sasselov & Scott 1996| 
|Seager, Sasselov & Scott 20001 |Wong, Moss & Scott 2008r ~and 
version 4. 1 of the WMAP likelihood) best-fit values: Q m = Qdm + 
n b = 0.27, n A = 1 - Hm, Ob/i 2 = 0.02249, h = 0.704 and 
dn s /dlnfe = (Lar son et al. 2010T ). Note that, here in all cases, 
r e i is a derived parameter and the error on obtaining this quantity 
is slightly underestimated because of neglecting the degeneracies 
between r e i and other cosmological parameters. 



2.3 Brief theory of PCA 

In this section, we outline the principal component method and in- 
troduce the notation that we will use throughout the paper. As has 
been described in Paper I, the principal components filter out com- 
ponents of the model which are most sensitive to the data. In order 
to determine the principal components of Ni on (z), we consider the 
data for photionization rate Tpi, the redshift distribution of Lyman- 
limit systems dN^/dz and the large-scale E-mode polarization 



angular power spectrum C ; (I < 23). 

We represent the unknown function N- 10 
discrete free parameters with the bin width 



"bin 



(2) by a set of ribin 



(4) 



L oc exp(— £) 



(2) 



http://cosmologist.info/cosmomc/ 
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We have taken a redshift range [z m i n : z m ax] = [0 : 30] and 
Az = 0.2 (i.e. ribin = 151). Then we construct the Fisher matrix 



= E 



(5) 



where Q a , a = 1, 2, ...,n bs represent the observa- 
tional data points (which in our case is given by Q a — 
{log(r P i),diVLL/dz, Cf E }), C/* h is theoretical value of Q a and 
iV^n is the fiducial model which is, in principle, close to the under- 
lying "true" model. In this work we take the fiducial model 
to be the model which matches the Fpi, dN^/dz and CMB data 
points up to an acceptable accuracy and also which is characterized 
by a higher Mon at higher redshifts. The match with the data for 
our fiducial model is similar to Figure 2 of |Choudhury (2009| > and 
Figure 1 of Paper I. 

Once the Fisher matrix is constructed, we can determine 
its eigenvalues and corresponding eigenvectors. Because of the 
orthonormality and completeness of the eigenfunctions, we can 
expand the deviation of Ni on from its fiducial model, SNi — 
Mon(zi) - JV£U«), as 



SN, 



= ^2 m k S k {zi 



(6) 



where Sk(zi) are the principal components of Ni on (zi) and mfc are 
the expansion coefficients. The advantage is that, unlike Ni on (zi), 
the coefficients are uncorrected with variances. 

In realistic situations, there will be other free parameters (apart 
from mk or SNi) in the model. Let there be n cxt number of extra 
parameters other than rrih', this means that we are now dealing with 
a total of n to t = n^ n + n cyLt parameters. In this case, we can still 
form the Fisher matrix of n to t x n to t dimensions which can be 
written as 



T ■ 



F 



B 
F' 



(7) 



where F is the ribm x ribm -dimensional Fisher matrix for the SNi, 
F' is the n ex t x n cxt -dimensional Fisher matrix for the other pa- 
rameters and B is a ribm x n cxt -dimensional matrix containing the 
cross-terms. One can then invert the above T to obtain the cor- 
responding Hessian matrix T = T~ x . Following that, one simply 
retains the sub-block T corresponding to SNi whose principal com- 
ponents will be "orthogonalized" to the effect of the other parame- 
ters. The resulting "degraded" sub-block will be iPress ^Tal. 1992t 



F = T _1 = F-BF' 1 B T 



(8) 



In this work we need to use the above formalism to marginal- 
ize over the normalization of the mean free path Ao, cosmological 
parameters ro s and ag. So, in this case, n cxt = 3. 

It can be shown that the largest eigenvalues correspond to min- 
imum variance and vice versa. Hence, most of the information rele- 
vant for the observed data points is contained in the first few modes 
with larger eigenvalues. We can then reconstruct the function SNi 
using only the first M < ribm modes. So, the important step in this 
analysis is to decide on how many modes M to be used. If we in- 
clude all the robin modes, then no information is thrown away, but 
the errors in the recovered quantities would be very large due to 
the presence of very small eigenvalues. On the other hand reduc- 
ing M can reduce the error but it may introduce large biases in the 
recovered quantities. 

One possible approach is to use the trial-and-error method to 
fix M, i.e. assume an underlying model which is different from 



the fiducial model but matches the current data sets quite accu- 
rately and study its recovery using only first few modes. We re- 
fer the reader to our earlier paper for a detailed discussion about 
this approach. A slightly more formal approach is to estimate M 
by minimizing the quantity Risk which is essentially the sum of 
the bias contribution which arises from neglecting the higher order 
terms, and the error (given by Cramer-Rao bound) arising because 
of higher order terms being included. We have checked that the 
quantity Risk has a clear minimum at M = 8 for our present case. 

However, both methods described above, involve the assump- 
tion of an "underlying model", hence the determination of M using 
this method would be model-dependent. An alternate prescription 
is to use Akaike information criterion ( Liddle 2007 1 



AIC = xLn + 2M 



(9) 



where smaller values are assumed to imply a more favored model. 
Similarly, one can also use the Bayesian information criterion de- 
fined by BIC = Xmin + Mlnn b s . The utility of these criteria 
over the Risk is that they are computed without knowing the under- 
lying solution (Clarkson & Zunckel 2010). The results using BIC 
typical give smooth reconstructions by underestimating the errors. 
The AIC, on the other hand, renders more featured reconstructions 
at the expense of large errors. However, as n ba is fixed for our cur- 
rent analysis, the minimum value of AIC corresponds to the min- 
imum of BIC, hence we simply carry out our analysis with only 
AIC. Note that there is no reason to select one particular recon- 
struction, the minimum of AIC can be accompanied by an increased 
chance of getting the reconstructed parameters wrong. According 
to |Clarkson & Zunckel (20l0l l, one successful strategy is to select 
different M which are near the minimum value of AIC and amal- 
gamate them equally at the Monte Carlo stage when we compute 
the errors. In this way, we can reduce the inherent bias which exists 
in any particular choice of M. We have examined that, in our case, 
the family of different M reconstructions, starting from M = 2, 
which satisfy 



AIC < AlCmin + « 



(10) 



where k = 10 (which corresponds to M — 8), produces very 
solid results. For alternative data sets, the value of k can be ad- 
justed. The choice of this parameter must be treated as a prior. 
The importance of using the AIC is that the analysis now becomes 
non-parametric. The method has been successfully used in recon- 
structing the dark energy equation of state using SN-Ia observations 
Clarkson & Zunckel 20l0l 



3 RESULTS 

3.1 The principal components of N on (z) 

The properties of the Fisher matrix Fij, obtained using equation 
((5), were discussed in detail in Paper I and they remain essentially 
the same. After diagonalizing Fij, we obtain its eigenvalues and 
the corresponding eigenmodes. In FigureQ] we show the inverse of 
the eigenvalues i.e., the variances of the corresponding modes. We 
have verified that the first 5 eigenvalues here are almost the same 
as those we got in our previous work. Interestingly, we get here few 
more eigenvalues which have considerably high values and hence 
they can not be ignored. However, one can see that 7 and 8 modes 
contain less useful information than the first 6 modes. But we have 
to check first whether we can simply neglect them or not, because 
neglecting 7 and 8 modes may introduce large biases in the recov- 
ered quantities. For that, we have used more than one method to 
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Figure 2. The first 8 eigenmodes of the Fisher matrix. 




Figure 1. The inverse of eigenvalues of the Fisher matrix F^j which essen- 
tially measures the variance on the corresponding coefficient. 



fix M(as described in the earlier section) and each method sug- 
gests that we should keep upto M — 8 modes in our analysis un- 
like the case for Paper I, where we got the optimum value of M 
is five. This is because the C;'s contain somewhat more informa- 
tion than what is contained within a single data point r e i. This fact 
can be noted from the plot of the first 8 eigenmodes (i.e., those 
which have the lowest variances) plotted in Figure[2] The first five 
modes are similar to what was obtained in Paper I. However, the 
modes 6 to 8 in Paper I did not contain any information, while in 
this case they show the sensitivity of JVi on (z) on different angular 



scales I. We find that all the eigenmodes tend to vanish at z > 15, 
which is obvious because of Fij being negligible at these redshifts. 
We can see a number of spikes and troughs in the first four modes 
whose positions correspond to the presence of data points for Tpi 
and dA^LL /dz at 2 < z < 6. The last four modes contain the infor- 
mation about the sensitivity of Cf B . This sensitivity is maximum 
around z ~ 7 — 8 and decreases at z > 8 due to unavailability of 
free electrons; it also decreases at z < 7 because of the fact that 
reionization is mostly completed at these redshifts (x e — s> 1) and 
hence changing Mon does not affect the value of C EE significantly 
at this redshift range. The modes (> 8) with smaller eigenvalues 
i.e. large variances introduce huge uncertainties in the determina- 
tion of JVion and hence do not contain any meaningful information 
about the reionization history. 



3.2 Markov Chain Monte Carlo Constraints from WMAP7 
data 

The constraints on reionization are obtained by performing a 
Monte-Carlo Markov Chain (MCMC) analysis over the parameter 
space of the optimum number of PC A amplitudes, Ao, n s and as- 
Other cosmological parameters are kept fixed to the WMAP7 best- 
fit values (see Section |2~2l . To avoid the confusion about the correct 
choice of number of modes, we perform the MCMC analysis for 
PCA amplitudes taking from M — 2 to M = 8, all of which obey 
the AIC criterion (equation |10t . We then weight each choice of 
M equally and fold the corresponding errors together to reproduce 
Mon and other related quantities along with their effective errors. 
In order to carry out the analysis, we have developed a code based 
on the publicly available COSMOMC flLewis & Bridle 2002) . We 
run a number of separate chains (varying between 5 to 10) until the 
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Parameters Mean value 95% confidence limits 

Tgi 0.093 [0.080,0.112] 

z{Qhu = 0.5) 10.206 [8.952, 11.814] 

z(Qhii = 0.99) 7.791 [5.800,10.427] 



Table 2. The marginalized posterior probabilities with 95% C.L. errors of 
all the derived parameters for the reionization model obtained from the cur- 
rent analysis using AIC criterion for WMAP data. 



Gelman and Rubin convergence statistics, R, corresponding to the 
ratio of the variance of parameters between chains to the variance 
within each chain, satisfies R— 1 < 0.01. Also we have used the 
convergence diagnostic of Raftery & Lewis to determine how much 
each chain must be thinned to obtain independent samples. Both of 
these are computed automatically by COSMOMC. 

We have shown the evolution of various quantities related to 
reionization using the AIC criterion for M = 2 to M = 8 in fig- 
ure [3] The solid lines represent the mean model while the shaded 
region correspond to 95% confidence limits. For comparison, we 
have also plotted the fiducial model (short-dashed) as described in 
Section |2~3l We find that the fiducial model is within the 95% con- 
fidence limits for the whole redshift range. Note that all the quan- 
tities are highly constrained at z < 6, which is expected as most 
of the observational information related to reionization exists only 
at those redshifts. The errors also decrease at z > 14 as there is 
practically no information in the PCA modes and hence all models 
converge towards the fiducial one. The most interesting informa- 
tion regarding reionization is concentrated within a redshift range 
6 < 2 < 14. 

It can be seen from the plot of iVi on (z) {top-left panel of figure 
that such quantity must necessarily increase from its constant 
value at z < 6 which confirms our findings from Paper I. This rules 
out the possibility of reionization with a single stellar population 
having non-evolving IMF and/or star-forming efficiency. The main 
difference from our previous results is that the allowed ranges in 
Mon at redshifts 7 < z < 12 has reduced significantly (earlier, 
values of Mon as large as 250 were allowed around z « 7.5, while 
the maximum allowed value has been reduced to ~ 100 in this 
work). While some of these constraints arise from the observation 
of Lyman-limit systems at z ~ 6, the major effects arise due to the 
inclusion of Ci 's into the analysis. This again confirms the fact that 
Ci's have more constraining power than r e i taken as a single point. 

The same conclusion can be drawn from the plot of Fpi(z) 
(top-middle panel), where we find that the maximum allowed value 
is ~ 10 -11 s _1 . This is nearly 10 times more stringent than what 
was allowed in Paper I. We find that the mean model is consistent 
with the observational data at z < 6, as expected. The errors cor- 
responding to 95% confidence limits are also smaller at this epoch. 
The photoionization rate for the fiducial model shows a smooth rise 
at z > 6 reaching a peak around z w 11; however, the model de- 
scribed by the mean values of the parameters shows a much sharper 
rise and much prominent peak around z ~ 6.5. The prominent 
peak-like structure is also present in the plot of dN^/dz (top- 
right panel). 

From the plot of Qhii(z) (bottom-left panel), we see that the 
growth of Qhii(z) for the mean model is much faster than that 
of fiducial model at initial stages, though the completion of reion- 
ization takes place only at z ~ 6. One can also find that reion- 
ization can be completed as early as z ~ 10.4 (95% confidence 
level). Similarly, xm(z) (bottom-middle panel) decreases much 



faster than the fiducial one at 6 < z < 12 and then smoothly 
matches the Lya forest data. 

Finally, we have shown the values of (a) C i TT , (b) C™ and 
(c) C EB for the mean model in the bottom-right panel of this fig- 
ure, which is almost the same as the fiducial model. So the current 
WMAP7 EE polarization data alone cannot distinguish between the 
various models of reionization. One can see that, our mean model 
includes most of the current WMAP7 best-fit CMB data within 
the error bars, except for a few C EE data points. Note that these 
discrepant points at I > 15 cannot be reconciled by any physical 
reionization model, implying that the spectra contribution might 
come from some other cosmological process, as e.g. gravitational 
lensing. 

The mean values and the 95% confidence limits on the pa- 
rameters obtained from our analysis are shown in the Table [2] We 
have checked that, our fiducial model which is characterized by 

mi = 7712 = 7773 = "74 = TH-b = 7776 = 7777 = 7778 = and 

the best-fit values of Ao, n s and erg, is included within the 95% 
confidence limits of those parameters corresponding to our current 
analyses using AIC criterion. We find that reionization is 50% com- 
plete between redshifts 9.0 - 11.8 (95% confidence level), while it 
is almost (99%) complete between redshifts 5.8 - 10.4 (95% confi- 
dence level). These values are similar to what was obtained in Paper 
I. Note that the lower limit on the redshift of reionization (5.8) is 
imposed as a prior on the parameters. Here the mean model for 
r e i shows a higher value than the best-fit WMAP7 value which is 
arising from relatively complex reionization histories giving non- 
zero ionized fractions at high redshifts. The value of T e t obtained is 
slightly lower than what we got in our earlier work, where we in- 
cluded r e i as a single data point instead of considering CMB large- 
scale EE polarization data which is because many models with very 
high A?i on are ruled out in this work. 

We have checked that, if we take any particular choice of M, 
say M = 7 or 8, our main findings are almost the same as the above 
results, except with the help of AIC criterion, we have reduced the 
inherent bias which is present for that specific choice of M and got 
a mean model which matches the current data sets quite reasonably. 

To summarize, we find that using C EE data set instead of 
Tei, we can get a relatively smaller error for Ni on (z) (see Figure 7 
of |Mitra, Choudhury & Ferr ara 20101 but get a t c i which is higher 
than the current WMAP value. So a wide range of reionization his- 
tories is still allowed by the data we have used. Reionization can 
be quite early or can be gradual and late, depending on the behav- 
ior of Ni on (z). Hence, using these data, it is somewhat difficult to 
put strong constraints on chemical feedback and/or the evolution of 
star-forming efficiencies and/or escape fractions. 

3.3 Markov Chain Monte Carlo Constraints from simulated 
PLANCK forecast data 

Given that the current data allow a large range of reionization mod- 
els, it is worthwhile computing the level of constraints expected 
from future large-scale polarization measurements like those ob- 
tained from PLANCK. To forecast the errors for parameters re- 
lated to the reionization history, we first generate the simulated 
PLANCK data of CMB power spectra for our fiducial model up to 
I < 2000 using the exact full-sky likelihood function at PLANCK- 
like sensitivity (Perotto et al. 2006; Galli e fal. 2010t . We assume 
that beam uncertainties are small and that uncertainties due to fore- 
ground removal are smaller than statistical errors. More sensitive 
observations will also require an exact analysis of non-Gaussian 
likelihood function, here for simplicity we assume isotropic Gaus- 
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Figure 3. The marginalized posteriori distribution of various quantities related to reionization history obtained from the PCA using the AIC criterion with 
first 8 eigenmodes. The solid lines correspond to the model described by mean values of the parameters while the shaded regions correspond to 2-cr limits. 
The points with error-bars denote the observational data points. Top-left: the evolution of the effective N- loIL (z); Top-middle: the hydrogen photoionization 
rate Tpj(^) along with the constraints from Bolton & Haehnelt (2007); Top-right: the LLS distribution dN^/dz with data points from Songaila & Cowie 
(2010); Bottom-left: the volume filling factor of HII regions Qhii(z); Bottom-middle: the global neutral hydrogen fraction xm(z) with observational limits 
from QSO absorption lines (Fan et al. 2006; filled square), Lya emitter luminosity function (Kashikawa et al. 2006; open triangle) and GRB spectrum analysis 
(Totani et al 2006; open square). Also shown the constraints using dark gap statistics on QSO spectra (Gallerani et al 2008a; open circles) and GRB spectra 
(Gallerani et al. 2008b; filled circle); Bottom-right: (a) TT, (b) TE and (c) EE power spectra with the data points from WMAP7 (Larson et al. 2010). In addition, 
we show the properties of the fiducial model (short-dashed lines) as described in Section l2~3l 



Parameters 




2-a errors 




WMAP7 


PLANCK (forecast) 


Tel 


0.032 


0.009 


z(Q HU = 0.5) 


2.862 


1.117 


z(Qhii = 0.99) 


4.627 


3.013 



Table 3. The 95% C.L. errors of derived parameters for the reionization 
model obtained from the current analyses using AIC criterion for WMAP7 
and simulated PLANCK data. 

sian noise and neglect non-Gaussianity of the full sky (Lewis 2005 ) 
and try to see what we can learn about the global reionization his- 
tory from PLANCK-like sensitivity. We then repeat the MCMC 
analysis over the same parameter space of Section [3~2l using this 
simulated data. Like the previous case, here we have also varied 
the number of modes included in the analysis from two to eight 
using the AIC criterion in order to study the effect of truncating 
the PCA expansion for the recovery of various quantities related to 
reionization. 

In the Table|3] we have shown the comparison of the 2-a errors 
on the derived parameters obtained for currently available WMAP7 
data and the same for forecasts from simulated PLANCK data. It is 



clear that the uncertainties on all the parameters related to reioniza- 
tion would be reduced considerably. In particular, we find that we 
should be able to constrain the redshift range at which reionization 
was 99% (50%) completed to about 3 (1). This is clearly a signifi- 
cant improvement over what can be achieved through current data 
sets. 

In Figure [4] we have illustrated the recovery the same quan- 
tities as mentioned in the earlier section using the AIC criterion 
taking up to 8 eigenmodes for the simulated PLANCK data. For 
comparison, here also we have plotted the results for the fiducial 
model (short-dashed lines) along with the mean results (solid lines) 
from MCMC analysis with shaded 2-a limits. We find that our main 
results are in quite reasonable agreement with those obtained from 
the WMAP data (Section[3~2l, except that all the 2-a (95 %) limits 
are reduced remarkably for all redshift range. 

We thus find that we can constrain the global reionization his- 
tory quite better using the PLANCK forecast data sets, especially 
the 2 — a limits for Qhii reduces significantly for this case. How- 
ever there is no room to substantially improve the constraints using 
large-scale is-modes for WMAP7 data sets and one still has to rely 
on other types of data for understanding reionization. 
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Figure 4. Same as Figure[5]but for planck likelihood 



4 DISCUSSION AND SUMMARY 



Based on the work of |Mitra, Choudhury & Ferrara (20 10) on prin- 
cipal component analysis of reionization model, we have studied 
constraints on reionization history using non-parametric methods. 
To model the unknown function Ni on (z), we have applied the prin- 
cipal component method using three different sets of data points 
- the photoionization rate rpi(z), the LLS distribution dN^/dz 
and current WMAP data for Cf E for I < 23. Following that, we 
have obtained constraints on the reionization history using MCMC 
techniques. We have also used the Akaike information criteria 
(AIC) to extract the underlying information about the PCA model 
and reduce the intrinsic bias present in any particular choice of fidu- 
cial model. We have applied our method to the currently available 
WMAP7 data as well as the simulated PLANCK data to forecast 
future errors on reionization. 

Our main findings can be summarized as follows - 

(i) We have found that the information about Ni on (z) or equiv- 
alently the star formation and/or chemical feedback lies in the first 
eight eigenmodes of the Fisher information matrix distributed over 
the range 2 < z < 14. Using the higher modes costs higher errors. 

(ii) The angular power spectra Ci of CMB observations contain 
more information than treating r e i as a single data point. This is ob- 
vious from the analysis of the Fisher matrix and results in (slightly) 
more stringent constraints on Ni on (z) and Tpi(z). 

(iii) The constraints at z < 6 are relatively tight because of the 
QSO absorption line data. On the other hand, a wide range of histo- 
ries at z > 6 is allowed by the data. Interestingly, it is not possible 
to match the available data related to reionization with a constant 



Ni on (z) over the whole redshift range, it must increase at z > 6 
from its constant value at lower redshifts. 

(iv) With currently available data from WMAP7, we constrain 
0.080 < r e i < 0.112 (95% CL) and also conclude that reionization 
is 50% complete between 9.0 < z(Qhii = 0.5) < 11.8 (95% CL) 
and is 99% complete between 5.8 < z(Qhii = 0.99) < 10.4 
(95% CL). 

(v) With the forthcoming PLANCK data on large-scale polar- 
ization (ignoring effect of foregrounds), the z > 6 constraints 
will be improved considerably, e.g., the 2 — a error on r e i will 
be reduced to 0.009 and the uncertainties on z(Qhii = 0.5) and 
z(Qmi = 0.99) would be ~ 1 and 3 (95% CL), respectively. The 
errors could be somewhat larger if the effect of foregrounds are 
incorporated into the analysis. For more stringent constraints on 
reionization at z > 6, one has to rely on data sets other than CMB. 

Finally, we try to indicate the data sets (other than CMB) 
which can possibly be used to better the constraints on reionization. 
Since most of the information on reionization at z < 6 come from 
QSO absorption lines, it is natural to expect more constraints from 
such observations at z > 6. In addition, spectra of GRBs, which 
are being observed at much higher redshifts ( Salvaterr a et al. 20091 
ITanvir et al. 2009 , Cucchia ra et al. 201 It could also provide addi- 
tional constraints. The difficulty is that the transmission regions 
(which are the sources for most of the information) are almost 
non-existent at high-z spectra, thus making the analysis more diffi- 
cult. Additional constraints on xwi at high redshifts are expected 
from Lya emitters I Taniguchi et al. 2005| [ Kashikawa^e t al. 2006] 
|Iye et al. 2006) IVanzella et al. 20101 ILehnert et al. 2010t , however 
they too are affected highly by systematics. On the positive side, 
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we feel that even a relatively weak constraint on xm at z ~ 7 — 10 
could be crucial in ruling out a subset of reionization models as the 
value of Nion(z) is most uncertain at these redshifts. 

We also now have observations of Lyman-break galax- 
ies till z ~ 10 IBouwens et al. 20071 IBouwens et al. 20101 
IBouwens et al. 20TQ . The luminosity function of such galaxies 
would be helpful in constraining properties of the galaxies like the 
IMF and/or the star-forming efficiency. Unfortunately, that would 
still leave out the escape fraction of ionizing photons, which remain 
an uncertain parameter till date. 

Other indirect observations that could help in constrain- 
ing reionization are the temperature measurements at z < 
6 <|Schaye et al. 2000| IRicotti et al. 20001 IMcDonald et al. 20011 
Zaldarriaga et al. 2001] ICen et al. 2009t . The temperature evolu- 
tion can retain memory of how and when the IGM was reion- 
ized and thus could provide additional constraints on reionization. 
Whatever be the case, the principal component method described in 
this paper, could be a promising tool for extracting the information 
from the future data sets in a model-independent manner. 
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